Fix parse_azure_endpoint passing query string to AsyncAzureOpenAI #231
Merged
gvanrossum merged 2 commits into microsoft:main on Apr 10, 2026
Conversation
uv 0.10.x is current; the <0.10.0 constraint caused build warnings.
`parse_azure_endpoint` returned the raw URL including `?api-version=...`, which `AsyncAzureOpenAI` then mangled into invalid paths like `...?api-version=2024-06-01/openai/`. Strip the query string before returning; `api_version` is already returned as a separate value and passed to the SDK independently.
This was referenced Apr 10, 2026
Collaborator
This breaks the current setup, e.g. `.env` files containing URLs with api-version information (which is the default in an Azure Foundry setup).
bmerkle added a commit to bmerkle/typeagent-py that referenced this pull request on Apr 13, 2026
Updated `parse_azure_endpoint` in `utils.py:200` to strip the `/openai/deployments/...` path from the endpoint URL. Previously it only removed the query string, leaving the full deployment path, which `AsyncAzureOpenAI` then duplicated, causing the 404. Updated `test_online.py` to use `create_chat_model()` → `model._model.request()` — the same `_make_azure_provider` → `AzureProvider(openai_client=AsyncAzureOpenAI(...))` code path used by the rest of the codebase.
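The path-stripping step added in this commit can be sketched as follows. The helper name `strip_deployment_path` is hypothetical; the real change lives inside `parse_azure_endpoint`:

```python
# Sketch: drop any /openai/deployments/... suffix so AsyncAzureOpenAI does
# not duplicate it when it builds request URLs from azure_endpoint.
from urllib.parse import urlsplit

def strip_deployment_path(url: str) -> str:
    """Illustrative helper: return the bare endpoint origin + path prefix."""
    parts = urlsplit(url)
    path = parts.path
    idx = path.find("/openai/deployments/")
    if idx != -1:
        path = path[:idx] or "/"  # keep at least a root path
    return f"{parts.scheme}://{parts.netloc}{path}"

endpoint = strip_deployment_path(
    "https://example.openai.azure.com/openai/deployments/gpt-4o"
)
```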
gvanrossum pushed a commit that referenced this pull request on Apr 14, 2026
bmerkle added a commit that referenced this pull request on Apr 22, 2026
**Stack: 3/4** — depends on #229. Merge #231, #229, then this PR.

---

- Add `add_terms_batch` and `add_properties_batch` to `ITermToSemanticRefIndex` and `IPropertyToSemanticRefIndex` interfaces
- SQLite backend uses `executemany` instead of individual `cursor.execute()` calls (~1000+ calls per indexing batch reduced to 2-3)
- Restructure `add_metadata_to_index_from_list` and `add_to_property_index` to collect all data first (pure functions), then batch-insert
- Memory backend implements batch methods as loops for interface compatibility

## Benchmark

### Azure Standard_D2s_v5 -- 2 vCPU, 8 GiB RAM, Python 3.13

#### Indexing Pipeline (pytest-async-benchmark pedantic, 20 rounds, 3 warmup)

Only the hot path (`add_messages_with_indexing`) is timed -- DB creation, storage init, and teardown are excluded.

| Benchmark | Before (min) | After (min) | Speedup |
|:---|---:|---:|---:|
| `add_messages_with_indexing` (200 msgs) | 28.8 ms | 25.0 ms | **1.16x** |
| `add_messages_with_indexing` (50 msgs) | 7.8 ms | 6.7 ms | **1.16x** |
| VTT ingest (40 msgs) | 6.9 ms | 6.1 ms | **1.14x** |

Consistent ~14-16% improvement -- `executemany` amortizes per-call overhead.

<details>
<summary><b>Reproduce the benchmark locally</b></summary>

Save the benchmark file below as `tests/benchmarks/test_benchmark_indexing.py`, then:

```bash
pip install 'pytest-async-benchmark @ git+https://github.com/KRRT7/pytest-async-benchmark.git@feat/pedantic-mode' pytest-asyncio

# Run on main
git checkout main
python -m pytest tests/benchmarks/test_benchmark_indexing.py -v -s

# Run on this branch
git checkout perf/batch-inserts
python -m pytest tests/benchmarks/test_benchmark_indexing.py -v -s
```

</details>

---

*Generated by codeflash optimization agent*

---------

Co-authored-by: Bernhard Merkle <bernhard.merkle@gmail.com>
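The collect-then-batch pattern this PR applies can be sketched minimally. The table name and schema below are illustrative, not the project's actual index schema:

```python
# Sketch of the executemany batching: collect all rows in a pure pass,
# then insert them with one call instead of one cursor.execute() per row.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE term_index (term TEXT, semref_id INTEGER)")

# Phase 1: collect (pure, no DB calls)
rows = [("alice", 1), ("bob", 2), ("alice", 3)]

# Phase 2: single batch insert amortizes per-call overhead
conn.executemany("INSERT INTO term_index VALUES (?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM term_index").fetchone()[0]
```

Separating collection from insertion is also what lets the memory backend satisfy the same batch interface with a plain loop.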
bmerkle added a commit that referenced this pull request on Apr 22, 2026
**Stack: 4/4** — depends on #230. Merge #231, #229, #230, then this PR.

---

- Five call sites used `get_item()` per scored ref — one SELECT and full deserialization per match (N+1 pattern)
- Added `get_metadata_multiple` to `ISemanticRefCollection` that fetches only `semref_id, range_json, knowledge_type` in a single batch query
- Replaced the N+1 loop with one `get_metadata_multiple` call at each site
- Further optimized scope-filtering: binary search in `contains_range`, inline tuple comparisons in `TextRange`, skip pydantic validation in `get_metadata_multiple`

### Call sites optimized

1. `lookup_term_filtered` — batch metadata, filter by knowledge_type/range
2. `lookup_property_in_property_index` — batch metadata, filter by range scope
3. `SemanticRefAccumulator.group_matches_by_type` — batch metadata, group by knowledge_type
4. `SemanticRefAccumulator.get_matches_in_scope` — batch metadata, filter by range scope
5. `get_scored_semantic_refs_from_ordinals_iter` — two-phase: metadata filter then batch fetch

### Additional optimizations

- **Binary search in `TextRangeCollection.contains_range`**: replaced O(n) linear scan with `bisect_right` keyed on `start`, reducing scope-filtering from ~25ms to ~9ms
- **Inline tuple comparisons in `TextRange`**: replaced `TextLocation` allocations in `__eq__`/`__lt__`/`__contains__` with a shared `_effective_end` returning tuples
- **Skip pydantic validation in `get_metadata_multiple`**: construct `TextLocation`/`TextRange` directly from JSON instead of going through `__pydantic_validator__`

## Benchmark

### Azure Standard_D2s_v5 — 2 vCPU, 8 GiB RAM, Python 3.13

#### Query (pytest-async-benchmark pedantic, 200 rounds)

200 matches against a 200-message indexed SQLite transcript. Only the function under test is timed.

| Function | Before (median) | After (median) | Speedup |
|:---|---:|---:|---:|
| `lookup_term_filtered` | 2.650 ms | 1.184 ms | **2.24x** |
| `group_matches_by_type` | 2.428 ms | 978 μs | **2.48x** |
| `get_scored_semantic_refs_from_ordinals_iter` | 2.541 ms | 2.946 ms | 0.86x |
| `lookup_property_in_property_index` | 25.306 ms | 9.365 ms | **2.70x** |
| `get_matches_in_scope` | 25.011 ms | 9.160 ms | **2.73x** |

<details>
<summary><b>Reproduce the benchmark locally</b></summary>

```bash
pip install 'pytest-async-benchmark @ git+https://github.com/KRRT7/pytest-async-benchmark.git@feat/pedantic-mode' pytest-asyncio
python -m pytest tests/benchmarks/test_benchmark_query.py -v -s
```

</details>

---

*Generated by codeflash optimization agent*

---------

Co-authored-by: Bernhard Merkle <bernhard.merkle@gmail.com>
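The `bisect_right` scope check described under "Additional optimizations" can be sketched as follows, assuming a simplified layout of sorted, non-overlapping ranges (the real `TextRangeCollection` is more involved):

```python
# Sketch: ranges sorted by start; bisect_right finds the only candidate
# range that could contain a point, replacing an O(n) linear scan.
from bisect import bisect_right

starts = [0, 10, 20, 30]  # sorted range starts
ends = [5, 15, 25, 35]    # matching exclusive ends

def contains(point: int) -> bool:
    i = bisect_right(starts, point) - 1  # last range with start <= point
    return i >= 0 and point < ends[i]
```

One O(log n) lookup per match is what drives scope-filtering from ~25 ms down to ~9 ms in the benchmark above.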
bmerkle pushed a commit that referenced this pull request on Apr 23, 2026
Fixed regressions caused by #231 that only showed up when using real API keys.
Stack: 1/4 — merge this first, then #229, #230, #232.
- `parse_azure_endpoint` returned the full URL including `?api-version=...`
- `AsyncAzureOpenAI` appends `/openai/` to `azure_endpoint`, producing a mangled URL with the query string in the path
- Fix: strip the query string with `str.split("?", 1)[0]` before returning

## Benchmark
No performance impact — this is a correctness fix.
Generated by codeflash optimization agent